Optional Top Level Keys and Configuration Log Option #96
Comments inline
    return bool()
elif key == []:
    return list()
elif key == OrderedDict():
This smells funny. As you point out, lists and dicts cannot be keys in a Python dictionary. So why do you allow them as your own keys? Why isn't it just key == 'list' and key == 'dict'?
Keep in mind, this will only match if the key is an empty OrderedDict. Do you mean to use type?
Finally, if key == OrderedDict(), you return dict(), meaning you've removed the ordering. Why the discrepancy in types?
@austinbyers it's not key == 'list' or 'dict' because in the schemas they are specified as [] or {}. Since we load as an OrderedDict, an empty {} is converted to OrderedDict(). It's meant to be empty.
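To make the point in this thread concrete, here is a minimal sketch (not the code from this PR). Comparing against an empty literal only matches empty values, while isinstance matches on type and lets the ordered type be preserved. The helper name default_for and the string type names ('string', 'integer', 'float', 'boolean') are assumptions for illustration.

```python
from collections import OrderedDict

# Equality against an empty literal only matches *empty* containers.
assert OrderedDict() == {}                        # empty OrderedDict equals empty dict
assert OrderedDict([('a', 1)]) != OrderedDict()   # a non-empty value does not match


def default_for(value_type):
    """Hypothetical helper: return an empty default for a schema value type."""
    if isinstance(value_type, list):
        return []
    if isinstance(value_type, dict):   # OrderedDict is a dict subclass
        return OrderedDict()           # keeps the ordered type instead of dict()
    # Assumed string type names; the real schema may use different ones.
    scalar_defaults = {'string': '', 'integer': 0, 'float': 0.0, 'boolean': False}
    return scalar_defaults[value_type]
```

With this shape, default_for(OrderedDict()) returns an OrderedDict, so the default has the same type as the schema value that requested it.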
json_payload[key_name] = default_optional_values(value_type)

# Handle jsonpath extraction of records
if config_options and len(config_options) and records_schema:
The len check is redundant: if config_options is truthy only if config_options is not None and is non-empty.
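A quick way to check this claim is to evaluate both forms of the condition on a few sample values. This is only an illustrative sketch; the function name and the sample values (including the 'Records[*]' path) are made up for the example.

```python
def len_check_is_redundant(value):
    """Compare the original condition with the simplified one for a value."""
    original = bool(value and len(value))   # short-circuits before len(None)
    simplified = bool(value)
    return original == simplified


# None, empty containers, and non-empty containers all agree.
samples = [None, {}, [], {'json_path': 'Records[*]'}, ['a']]
results = [len_check_is_redundant(sample) for sample in samples]
```

Every entry in results is True, so the condition can be shortened to config_options and records_schema without changing behavior for these inputs.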
test/unit/test_classifier.py
Outdated
'kinesis': {
'data': base64.b64encode(kinesis_data)
}
}
This indentation doesn't make any sense to me
autopep did it 😢
Thanks for the review @austinbyers, I just addressed your comments
LGTM
Instead of having optional top level keys for a log type, move all options into 'configuration'. This includes KV/CSV delimiter/separator, json_path, and envelope keys.
The goal of optional_top_level_keys is to provide a mechanism for a more flexible schema. If the key(s) described in optional_top_level_keys exist in the incoming record, they are simply type checked. If they do not exist, a default value is created in their place so that rules do not reference keys that do not exist.
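The behavior described above can be sketched roughly as follows. This is not the implementation in this PR: the function name apply_optional_keys, the TYPE_INFO table, and the string type names are assumptions chosen for illustration.

```python
# Assumed mapping of schema type names to (Python type, default value).
TYPE_INFO = {
    'string': (str, ''),
    'integer': (int, 0),
    'boolean': (bool, False),
}


def apply_optional_keys(record, optional_keys):
    """Type check optional keys that exist; fill defaults for missing ones.

    Returns False if an existing key holds a value of the wrong type.
    The record is modified in place.
    """
    for key, type_name in optional_keys.items():
        expected_type, default = TYPE_INFO[type_name]
        if key in record:
            if not isinstance(record[key], expected_type):
                return False
        else:
            record[key] = default   # rules can now reference this key safely
    return True
```

For example, a record of {'name': 'x'} checked against {'name': 'string', 'count': 'integer'} passes and gains 'count': 0, while a record with 'count': 'five' is rejected.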
…evel_keys support
f1c67b9
to
bd062ea
Compare
to: @airbnb/streamalert-maintainers
size: medium
resolves #95
Library Changes
- conf/log type options under a new key called configuration
- hints stays at the top level, since it's not an option, but rather a guard.
- delimiter, separator move under configuration.
- envelope is now called envelope_keys and nested under configuration.
- records is now called json_path and also nested under configuration.
- configuration key called optional_top_level_keys.
- pylint offenses.
Other Changes
- autopep helper script
- unit_test.sh script to test the stream_alert_cli package
Notes
- conf/logs.json settings if the keys are not updated as described above.
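As an illustration of the key moves and renames listed under Library Changes, a log definition could be updated along these lines. This is a hedged sketch, not a tool shipped with this PR: the function name migrate_log_definition is hypothetical, and any key not named in the list (such as 'parser' in the usage note) is an assumption that is simply passed through unchanged.

```python
def migrate_log_definition(old):
    """Return a copy of an old-style log definition in the new layout."""
    renamed = {'envelope': 'envelope_keys', 'records': 'json_path'}
    moved = ('delimiter', 'separator')

    new = {}
    # Start from any configuration block that is already present.
    configuration = dict(old.get('configuration', {}))
    for key, value in old.items():
        if key == 'configuration':
            continue
        if key in renamed:
            configuration[renamed[key]] = value   # renamed and nested
        elif key in moved:
            configuration[key] = value            # nested under the same name
        else:
            new[key] = value                      # e.g. hints stays at the top level
    if configuration:
        new['configuration'] = configuration
    return new
```

Given {'parser': 'csv', 'hints': {'a': ['b']}, 'delimiter': ',', 'records': 'Records[*]'}, the result keeps parser and hints at the top level and nests delimiter and json_path under configuration.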